Perturbation method for deleting redundant inputs of perceptron networks
Authors
Abstract
Multilayer feedforward networks are often used for modeling complex functional relationships between data sets. Should measurable redundancy exist in the training data, deleting unimportant components of the training vectors can lead to smaller networks due to the reduced-size data vectors. This reduction can be achieved by analyzing the total disturbance of network outputs due to perturbed inputs. The search for redundant input data components proposed in the paper is based on the concept of sensitivity in linearized models. The mappings considered are ℝ^I → ℝ^K with continuous and differentiable outputs. Criteria and an algorithm for input pruning are formulated and illustrated with examples.

INTRODUCTION

Neural networks are often used to model complex functional relationships between sets of experimental data. Such a modeling approach proves useful when analytical models of processes do not exist or are not known, but sufficient data is available for embedding the existing relationships into neural network structures. Multilayer feedforward neural networks (MFNN) consisting of continuous neurons have been found particularly useful for such model building [1-3]. In this case, representative training data are used for supervised training of a suitable user-selected MFNN architecture. Minimization of potential redundancy in the data used for supervised training can take different forms. Duplicate data pairs are essentially removable from the training sets without loss of accuracy. In contrast, special attention should be paid to data that carry conflicting information. Such data do not normally allow for a unique mapping and should be eliminated. Our concern in this paper is to explore potential redundancy in input vector dimensionality.* As such, this concern has little in common with the widely used notion of network pruning.

*This work was supported in part by the ONR Grant N00014-93-1-0855.
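The total disturbance of the network outputs under perturbed inputs, mentioned above, can be estimated numerically for any trained model. A minimal sketch, assuming a generic callable `net` and an averaged output-norm metric (both illustrative choices, not the paper's exact criterion):

```python
import numpy as np

def output_disturbance(net, X, delta=1e-3):
    """For each input component, estimate the average disturbance of the
    network outputs over a data set X (N x I) when that component alone
    is perturbed by +/- delta. Components scoring near zero are
    candidates for deletion as redundant inputs."""
    N, I = X.shape
    scores = np.zeros(I)
    for i in range(I):
        dx = np.zeros(I)
        dx[i] = delta
        for x in X:
            # symmetric difference of the output vectors under perturbation
            d = net(x + dx) - net(x - dx)
            scores[i] += np.linalg.norm(d)
    # normalize by sample count and perturbation size
    return scores / (N * 2 * delta)
```

An input that the underlying function truly ignores produces a score of exactly zero here, while relevant inputs score on the order of the mapping's partial derivatives.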
By deleting superfluous inputs, if such inputs exist, the number of input nodes is reduced. The resulting network is still pruned, as it contains no weights fanning out of the deleted inputs. A popular objective of network pruning is to detect irrelevant weights and neurons. This can be achieved through evaluation of the sensitivities of the error function to the weights, which are the learning parameters [5-6]. Error functions other than quadratic are often used to achieve identification of insensitive weights. Statistical moments of mappings built by neural networks, including sensitivities to inputs, are discussed in [7]. Our focus in this paper is mainly to develop clear and practical measures of sensitivity to inputs, rather than to weights or neurons. A systematic algorithmic approach has then been developed to utilize these measures for the deletion of redundant inputs. To determine which inputs are necessary for satisfactory neural network performance, a metric known as saliency was introduced in [8]. Belue and Bauer developed an algorithm extending the saliency metric over the entire input space [9]. Their approach involves multiple neural network trainings and the superposition of noise on the training patterns to reduce the dependence of the results on local minima. This method, however, is computationally intensive due to the required multiple training sessions and exhaustive coverage of the input space. The saliency method was developed to determine irrelevant features for neural network classifiers [9]. The sensitivities of MFNN outputs with respect to inputs are calculated and used along with various metrics to evaluate the importance of features. Such classifier networks, when fully trained, are in general characterized by small sensitivities. Therefore, saliency can be applied only with the addition of noise to the training patterns and with sampling of the input space over the whole domain.
Multiple trainings are necessary to average the results and prevent dependence on the local minima reached during training. This paper focuses on the concept of sensitivity, or the perturbation method, for pruning unimportant inputs of neural networks providing a continuous mapping. This assumption and the proposed new sensitivity summation metrics allow the method to be applied directly to trained MFNNs, without adding noise to the training patterns or performing multiple trainings. In fact, in the case of a continuous mapping, the problem of local minima reached during training is not important provided sufficient approximation accuracy is achieved. As a result, the presence of local minima does not affect the Jacobian matrix used by this method. The Jacobian matrix is derived from the approximate neural network mapping over the training data set. This eliminates the need for computationally intensive repetitive training. In addition to mappings with continuous outputs, the sensitivity method can be applied to classification problems. However, in such cases an additional neural network has to be trained, as described in one of the examples.

Let us consider an MFNN with a single hidden layer. The network is assumed to perform a nonlinear, differentiable mapping Φ: ℝ^I → ℝ^K, o = Φ(x), where o (K×1) and x (I×1) are the output and input vectors, respectively. In the further discussion it is assumed that certain inputs bear no, or little, statistical or deterministic relationship to the output vectors, and are therefore removable. The objective here is to reduce the original dimensionality of the input vector x so that a smaller network can be used as a model without loss of accuracy. Initial considerations published in [10-12] are extended below, along with a formal framework for the perturbation approach as applied to neural network models.

Let o: ℝ^I → ℝ^K with component functions o1, o2, ..., oK. Suppose x^(n) ∈ Ω, where Ω ⊂ ℝ^I is an open set. Since o is differentiable at x^(n), we have

o(x + Δx) = o(x) + J(x)Δx + g(Δx)    (1)

where J(x) is the K×I Jacobian matrix of the mapping at x and g(Δx) collects the higher-order terms.
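For the single-hidden-layer MFNN considered here, the Jacobian in Eq. (1) has a closed form, and accumulating its columns over the training set yields per-input sensitivity scores. A minimal sketch, assuming tanh hidden units, linear outputs, and a mean-absolute-value summation metric (illustrative choices, not the paper's exact formulation):

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Single-hidden-layer MLP: tanh hidden units, linear outputs."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

def mlp_jacobian(x, W1, b1, W2, b2):
    """Analytic Jacobian do/dx (K x I) of the network mapping at input x.
    Uses d tanh(u)/du = 1 - tanh(u)^2 applied element-wise."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ (np.diag(1.0 - h ** 2) @ W1)

def input_sensitivities(X, W1, b1, W2, b2):
    """Mean absolute Jacobian column, accumulated over all rows of X.
    One score per input component; small scores flag redundant inputs."""
    S = np.zeros(X.shape[1])
    for x in X:
        J = mlp_jacobian(x, W1, b1, W2, b2)
        S += np.abs(J).sum(axis=0)
    return S / len(X)
```

Because the Jacobian comes from the already-trained mapping, ranking inputs by these scores requires only forward-pass quantities over the training set, with no retraining; an analytic Jacobian can also be cross-checked against central differences.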
Similar resources
Kinematic Synthesis of Parallel Manipulator via Neural Network Approach
In this research, Artificial Neural Networks (ANNs) have been used as a powerful tool to solve the inverse kinematic equations of a parallel robot. For this purpose, we have developed the kinematic equations of a Tricept parallel kinematic mechanism with two rotational and one translational degrees of freedom (DoF). Using the analytical method, the inverse kinematic equations are solved for spe...
Use of Artificial Neural Networks and PCA to Predict Results of Infertility Treatment in the ICSI Method
Background: Intracytoplasmic sperm injection (ICSI) or microinjection is one of the most commonly used assisted reproductive technologies (ART) in the treatment of patients with infertility problems. At each stage of this treatment cycle, many dependent and independent variables may affect the results, according to which, estimating the accuracy of fertility rate for physicians will be difficul...
A Provably Convergent Dynamic Training Method for Multi-layer Perceptron Networks
This paper presents a new method for training multi-layer perceptron networks called DMP1 (Dynamic Multi-layer Perceptron 1). The method is based upon a divide-and-conquer approach which builds networks in the form of binary trees, dynamically allocating nodes and layers as needed. The individual nodes of the network are trained using a genetic algorithm. The method is capable of handling real...
Optimal Input Selection of Neural Networks by Sensitivity Analysis and Its Application to Image Recognition
This paper describes a method of selecting optimal inputs of neural networks without lowering recognition rate. In general, neural networks learn a recognition algorithm from inputs and their desired outputs. But if some inputs are redundant, namely they are expressed by other inputs fed to the networks or they do not contribute to recognition, their effect on the recognition algorithm, or thei...
Neural Networks with Complex and Quaternion Inputs
Many neural network architectures operate only on real data and simple complex inputs. But there are applications where considerations of complex and quaternion inputs are quite desirable. Prior complex neural network models have generalized the Hopfield model, backpropagation and the perceptron learning rule to handle complex inputs. The Hopfield model for inputs and outputs falling on the uni...
Journal:
- Neurocomputing
Volume 14, Issue
Pages: -
Publication year: 1997